_____________________________________________________________________________

 Copyright 1994, Silicon Graphics, Inc.  All Rights Reserved.

 THIS DOCUMENT CONTAINS UNPUBLISHED INFORMATION OF SGI

 The copyright notice above does not evidence any actual or intended
 publication or disclosure of this document, which includes information
 that is confidential and/or proprietary, and is a trade secret, of
 Silicon Graphics, Inc.  ANY DUPLICATION, MODIFICATION, DISTRIBUTION,
 PUBLIC PERFORMANCE, OR PUBLIC DISPLAY OF THIS DOCUMENT OR ANY PORTION
 OF THIS DOCUMENT, WITHOUT THE EXPRESS WRITTEN CONSENT OF SILICON
 GRAPHICS, INC. IS STRICTLY PROHIBITED.  THE RECEIPT OR POSSESSION OF
 THIS DOCUMENT DOES NOT CONVEY ANY RIGHTS TO REPRODUCE, DISCLOSE OR
 DISTRIBUTE ITS CONTENTS, OR TO MANUFACTURE, USE, OR SELL ANYTHING THAT
 IT MAY DESCRIBE, IN WHOLE OR IN PART.
_____________________________________________________________________________

 ~4Dgifts/toolbox/src/exampleCode/networking/HIPPI/HIPPIperf.txt

                     HIPPI Performance on IRIX 5.2

                            by Paul Reilly
                            MSD Marketing
                        Silicon Graphics, Inc.
                      Mountain View, California

                            June 24, 1994

SUMMARY:  This white paper explores the performance envelope of Silicon
Graphics IRIS HIPPI version 1.0 as run on the CHALLENGE(TM) and Onyx(TM)
computer systems running IRIX version 5.2.  This report is written for
the HIPPI expert.

Acknowledgments:  We wish to thank the following people, who contributed
to this white paper (in alphabetic order):  Scott Bovenizer, Lise
Garrett, Thomas Skibo, Rob Warnock, and Audy Watson.

IRIS, Silicon Graphics, and the Silicon Graphics logo are registered
trademarks, and CHALLENGE, Onyx, POWER Channel, and IRIX are trademarks,
of Silicon Graphics, Inc.  NFS is a registered trademark of Sun
Microsystems, Inc.  UNIX is a registered trademark of UNIX System
Laboratories, Inc.

Introduction.
=============

Once upon a time, we had one of those `muscle' cars with a 427 cubic
inch engine, four-on-the-floor, traction control, and a radio blaring
Beach Boys music at 110 dB.  Of course, that was a very long time ago,
back when the Beach Boys were still boys.  Much has changed in the
intervening years.  Those originals gave way to a series of
progressively more `sensible' cars, until our present vehicle is barely
capable of exceeding the legal speed limit.  Yet the memories linger of
the weekends at the drag strip, the roar of powerful engines, the smell
of burning rubber, the dreams of breaking a ten-second elapsed time....

While not quite as exciting as burning rubber for a hundred yards, we
just had the opportunity to relive a little of that excitement.  Silicon
Graphics has recently announced a new IRIS HIPPI card for the
CHALLENGE(TM) and Onyx(TM) families of computer systems, and we were
asked to see just what it can do.  A one-word summary of the results:
Awesome!

The IRIS HIPPI Interface:

A detailed description of the IRIS HIPPI interface is beyond the scope
of this white paper--the documentation does an excellent job.  If you
are interested in the implementation details, we recommend that you
obtain a copy of the IRIS HIPPI Administrator's Guide, Document Number
007-2229-002, which gives a fairly detailed description of the board's
hardware implementation.  Of equal importance is the IRIS HIPPI API
Programmer's Guide, Document Number 007-2227-001.  It describes the
various software interfaces available to the IRIS HIPPI interface,
including the HIPPI-PH layer, which equates to raw mode I/O.  You can
get copies of this documentation from your local Silicon Graphics sales
representative.

Since most of you are probably interested in the HIPPI-PH interface, we
have attached example programs which use it in the appendices of this
white paper.  These are sink and blast.  Neither is a complete, finished
program.
They are merely working examples, or so-called scrub programs, which
demonstrate how one should actually program the HIPPI-PH interface of
IRIS HIPPI.  However, they are working programs which you can try out on
your IRIS HIPPI cards and tinker with to see just what performance you
can get.  In fact, we used them as part of the testing that went into
this report.

The Machines:
=============

Naturally enough, we should start this performance report with a
description of the equipment used.  The systems used were a base model
CHALLENGE L and a deskside Onyx, each with two 150 MHz processors, 256
MB of memory (one-way interleaved), and a POWER Channel(TM) 2 (IO4).
One HIO port had a SCSI adapter installed, while the other HIO port had
the IRIS HIPPI card attached to it.  The two systems were directly
connected by HIPPI cables, with no switch in between.

The disk subsystem consisted of six fast-and-wide SCSI disks attached to
each system as a six-way-striped logical volume, with the disks striped
three each on each of two controllers.  Thus, the two systems were
fairly common configurations, ones that you yourself are likely to have.
As we will see, the disk configuration was the weak link--the IRIS HIPPI
card was easily able to stay well ahead of this disk configuration.

The Software:

Each system was loaded with the released version of IRIX(TM) 5.2 and
version 1.0 of the IRIS HIPPI driver and related software.  The only
software tuning was that the default TCP window/socket space was
increased to 512 KB.  That is, in /var/sysgen/master.d/bsd,

    unsigned long tcp_sendspace = 60 * 1024;    /* must be < 256K */
    unsigned long tcp_recvspace = 60 * 1024;    /* must be < 256K */

was changed to

    unsigned long tcp_sendspace = 512 * 1024;   /* must be < 256K */
    unsigned long tcp_recvspace = 512 * 1024;   /* must be < 256K */

Please note that the comments on these two lines are incorrect; the
actual maximum size is 512 KB.
On to the Drag Strip:
=====================

Naturally, whenever you have a shiny new hot rod, the first thing you
want to do is see just how fast it really is--so you take it to the
local drag strip and burn rubber.  The UNIX equivalent of the drag strip
is raw mode I/O, which in the case of IRIS HIPPI translates into
HIPPI-PH.  As noted above, we used blast and sink to wring this
interface out.  They are attached in the appendices, and will be made
available on a forthcoming developers' toolbox.

If you read the listings, you will find that blast, the transmitting
program, has several options.  The first is -1 or not.  This controls
whether it forks into two processes (-1 turns forking off).  Since there
are latency issues that affect performance, such as doing the mpin and
munpin of physical memory, it is desirable to run at least two
interleaved processes if possible.  Since this interface runs only on a
multiprocessor computer (there are always at least two CPUs), it makes
sense to do this.  Since we were using only one HIPPI interface per
system and no switch, the -D and -I switches, which define the device
name and ifields, are not relevant.  The final three options are the
length of the write performed, the number of packets to be sent, and the
number of times to repeat the test.

Various write lengths were evaluated in powers of two from 256 bytes to
2 MB.  The upper limit of 2 MB was set by the amount of hardware
buffering on the IRIS HIPPI board, which can handle a maximum of 2 MB.

Table 1 shows the results of running sink and blast over the IRIS HIPPI
PH interface.  They are in megabytes per second as reported by blast.
The left-most column is the write size (in bytes) used.  The second
column shows the results when -1 was not used; that is, there was a fork
and subtasking was used.  The third column shows the results when -1 (no
forking) was used.
    I/O Size    (forking enabled)   (forking disabled)
    (bytes)          (MB/s)              (MB/s)

        256           0.32                0.32
        512           0.64                0.66
       1024           1.36                1.36
       2048           2.64                2.71
       4096           5.28                5.13
       8192           9.52                9.52
      16384          16.62               16.98
      32768          22.15               24.16
      65536          33.60               31.88
     131072          46.30               37.88
     262144          61.88               50.61
     524288          70.42               62.19
    1048576          74.29               70.13
    2097152          77.94               73.21

        Table 1  Results of running sink and blast over the
                 IRIS HIPPI PH interface

As you can see, the size of the I/O clearly makes a difference.  The
shorter I/O lengths gave fairly poor performance until the 512 KB size
was used.  At that point, the performance gradually increased until the
maximum of 78 MB/second was reached with 2 MB writes and reads.

The second observation is that enabling forking to share the CPU compute
load between two processors is a win.  The performance difference
between the two modes of running blast and sink is slight until you
start using I/O with lengths greater than 64 KB.  After that, there is a
five to ten percent gain with forking enabled.

One point not shown in Table 1 is that the IRIS HIPPI interface can
actually transmit faster than it can receive, and so the results in
Table 1 are the receive performance data.  If you are interested in the
transmit performance of IRIS HIPPI, you can let the interface run free
by simply not starting the sink task on the receiving system.  If you
were to do this, then the rate for 2 MB writes (using blast alone) is
close to 92 MB/second.

Cruising the TCP/IP Expressway:
===============================

While HIPPI-PH is quite useful for a number of applications, all of them
must be specially written to use it.  There are, however, a large number
of common programs which use the IP networking stack.  Among them are
rcp and ftp, as well as NFS.  We will start with NFS.  NFS is a
request/response protocol in which the client sends a request and then
waits until the response (usually just an 8 KB block of data) is
fulfilled.
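A back-of-the-envelope model shows why such a synchronous
request/response pattern caps throughput regardless of link speed.  In
this sketch, the function name and the 10 ms round-trip figure are our
illustrative assumptions, not measured values:

```c
#include <assert.h>

/* Synchronous request/response throughput: only one block is in
 * flight, so the wire moves one block per round trip, no matter how
 * fast the underlying medium is. */
double sync_throughput(double block_bytes, double rtt_seconds)
{
    return block_bytes / rtt_seconds;
}
```

One 8 KB block every 10 ms works out to roughly 800 KB/second, which is
about the figure NFS has historically delivered; a faster medium shaves
only the transmission time, not the per-request wait.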
History has shown that NFS is self-limiting, generally on the order of
800,000 bytes per second, which is roughly the bandwidth of ethernet.
When FDDI first came out, everybody was surprised that NFS performance
was only about 10% faster on FDDI than it was on ethernet.  The NFS
implementation is being further developed to improve its performance on
these higher bandwidth media.  These enhancements will be incorporated
into version 5.3 of IRIX.

On the other hand, TCP is well suited to high speed interfaces such as
HIPPI.  The reason is that it uses a sliding-window buffer to keep the
data flowing, almost as though there were a conveyor belt set up between
the sender and receiver.  It also has a number of very clever
optimizations.  One is `slow-start', which starts the data transfer over
TCP at a fairly low rate and gradually (for a computer, at least)
increases the flow of data until either it finds that it has lost a
packet or it reaches full throttle.  There is also something called MTU
discovery, which permits two TCP entities to negotiate the size of the
packets to be sent between them.  Thus, TCP is self-governing, adapting
itself to conditions on the network.

Tuning Silicon Graphics' TCP for Speed.

Recently, RFC 1323 was published, which includes additional TCP
performance improvements.  However, some of them, such as large window
sizes, will actually cause performance losses on older systems that
don't have them.  So, to keep the peace, we ship IRIX with these
features in it and then turn them off, so as to keep compatibility with
older versions of TCP.  The controlling variables can be found in
/var/sysgen/master.d/bsd.  They read:

    /* TCP window sizes/socket space reservation */
    unsigned long tcp_sendspace = 60 * 1024;    /* must be < 256K */
    unsigned long tcp_recvspace = 60 * 1024;    /* must be < 256K */

    /* TCP large windows (RFC 1323) control. */
    int tcp_winscale = 1;
    int tcp_tsecho = 1;

The last two variables are set to true and should be left that way.
The way we actually control whether RFC 1323's features are turned on is
by the window sizes.  If they are set to 60 * 1024 or less, then the
assumption is that the RFC 1323 features should not be used.  On the
other hand, if the window size is set to greater than 60 KB, then they
should be used.

There are two ways of doing this.  First, you can, as we did, edit the
file and increase tcp_sendspace and tcp_recvspace.  And as noted above,
the correct upper limit is 512 KB, not 256 KB.  What this does is set
the default TCP window sizes.  You can also set the window sizes with a
setsockopt call.  This is handy in that you can selectively increase the
TCP window size for a particular interface such as HIPPI without
increasing it for all your networking interfaces.  We will get back to
this in a moment when we discuss ttcp below.  The reason we chose to
edit the /var/sysgen/master.d/bsd file was that some programs, such as
ftp, do not make this setsockopt call, which forces you to increase the
window sizes by changing the defaults defined in
/var/sysgen/master.d/bsd.

Memory-to-Memory TCP/IP Performance over HIPPI.

For some time now, ttcp has been the acknowledged program for testing
TCP/IP performance.  It started out in the early 1980s as a program from
BRL, then followed several paths to virtually every UNIX implementation
that supports TCP/IP (which is virtually all of them).  The particular
version used in this test is the one found in the optional
eoe2.sw.ipgate subsystem that can be selected during inst.  The source,
or a reasonable facsimile, is distributed in
/usr/people/4Dgifts/examples/network, so if you don't have it on your
system, or you would like to play with it, you can find it there.

While ttcp can be used to test UDP performance as well, we limited
ourselves to the TCP/IP mode.  ttcp can also be used to do disk reads
and writes, but most people prefer to use rcp or ftp for that, so we
used ttcp solely in its memory-to-memory mode.  That is, we used the -s
switch.
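For programs you do control, the per-socket alternative mentioned above
is a setsockopt call on SO_SNDBUF and SO_RCVBUF.  A minimal sketch (the
helper name is ours, not part of any SGI library) looks like this:

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Enlarge the TCP windows for one socket only, instead of changing the
 * system-wide defaults in /var/sysgen/master.d/bsd.  Returns 0 on
 * success, -1 on failure. */
int set_tcp_windows(int sock, int bytes)
{
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   (char *)&bytes, sizeof bytes) < 0)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   (char *)&bytes, sizeof bytes) < 0)
        return -1;
    return 0;
}
```

Call it right after socket() and before connect() or listen(), e.g.
set_tcp_windows(s, 512 * 1024); setting the buffers before the
connection is established lets TCP use the large windows for that
socket alone.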
A few words of warning need to be made about both ttcp and TCP/IP.
First, ttcp can and does use all those neat features that make TCP/IP
run fast, including slow-start, MTU discovery, and a host of others.  As
noted, TCP/IP will spend several seconds when a transfer is first
started to `feel' out the network and determine just how fast it can
run.  As a result, very short runs--runs of less than 10 seconds--give
spuriously low performance.  As a rule of thumb, you should never make a
ttcp run of less than 30 seconds.  We prefer 60 seconds as a minimum.
Thus, you should always set the -n switch to some number which gives a
run that lasts at least 30 seconds.  Otherwise, you will not have valid
results.  Now for those results.

The ttcp Results.
=================

We ran ttcp in TCP mode and memory-to-memory mode only.  That is to say,
the generic ttcp command for receive was

    ttcp -r -s -l####

while the generic ttcp command for transmit was

    ttcp -t -s -l#### -n100000 -b524288 [-D] hippi-hostname

The values for -l#### varied from 256 to 65536 bytes by powers of 2,
with the exception of 61440, which has a special meaning that we will
explain in a minute.  For all buffer sizes except 61440 and 65536, we
used -n100000.  In the case of -l61440 and -l65536, -n50000 was used
instead.  Thus, all tests ran for at least 17 seconds, while the longer
lengths ran for about one minute.  The -b524288 option sets the TCP
window size to 524288 (or 512 KB), which could be used in place of the
changes to /var/sysgen/master.d/bsd noted above (-b sets the socket
buffer size of SO_SNDBUF/SO_RCVBUF with a setsockopt call).  In
addition, we used -D to set TCP_NODELAY on some of the tests, to see
what impact this feature would have on performance.

Table 2 shows the results of running ttcp in memory-only mode over IRIS
HIPPI using TCP/IP.  The various options for ttcp were as noted above.
The results are in kilobytes per second (i.e., units of 1024 bytes) as
reported by ttcp.
The results for both the transmitting and receiving tasks of ttcp are
shown.

                      -D on                    No -D
    I/O Size    Receive    Transmit      Receive    Transmit
    (bytes)    (kbytes/s) (kbytes/s)    (kbytes/s) (kbytes/s)

        256       1055       1055          1587       1588
        512       2463       2464          3498       3500
       1024       5803       5805          6208       6211
       2048      12143      12160         11786      11791
       4096      22001      22007         32626      32654
       8192      32972      32984         40666      40681
      16384      38756      38763         41334      41341
      32768      42235      42240         42908      42913
      61440      47192      47196         46213      46216
      65536      42100      42103         41992      41997

        Table 2  Results of running ttcp in memory-only mode over
                 IRIS HIPPI using TCP/IP

The results show that one can get very good TCP/IP performance over IRIS
HIPPI.  The maximum performance was about 47 MB/second.  Not
surprisingly, the performance increased as the size of the read/write
increased, with the exception of 60 KB (61440), which is clearly the
sweet spot in this series of tests.  The performance then falls off
slightly at 64 KB.

This is not unexpected.  Remember that IP has a 16-bit length field, and
so the size of a TCP/IP packet cannot be larger than 64 KB.  However,
since you still have to wrap the TCP portion of the transmitted frame
with IP headers and trailers, you cannot really send a 64 KB TCP packet
without fragmentation.  Thus, 60 KB is the sweet spot, as it is the
largest chunk of user data that can be sent inside a 64 KB TCP/IP
encapsulation and still be a multiple of 4 KB, the page size.  We
therefore expected the maximum performance to be at 60 KB, because it
fully utilizes the available space inside a TCP/IP packet and avoids any
packet fragmentation.  The reason the results for 64 KB buffers were
worse than even the 32 KB buffers is that fragmentation was taking
place.

Naturally, this leads to the question of what happens if you fail to use
a buffer that is a multiple of 4 KB.  Well, on the transmission side,
the answer is not much, for it is still possible to `page flip' the
output buffer down to the device driver and so avoid copying data.
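The sweet-spot arithmetic can be written down directly.  In this sketch,
the function name and the 40-byte header figure (20 bytes of IP header
plus 20 bytes of TCP header, with no options) are our illustrative
assumptions:

```c
#include <assert.h>

/* Largest write that both fits in a single IP datagram (after header
 * overhead) and is a whole number of virtual-memory pages. */
long tcp_sweet_spot(long max_datagram, long headers, long pagesize)
{
    return ((max_datagram - headers) / pagesize) * pagesize;
}
```

With max_datagram = 65535 (the 16-bit IP limit), headers = 40, and
pagesize = 4096, this yields 61440 bytes: exactly the 60 KB buffer size
that topped Table 2.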
However, on the receive side, you cannot do this indiscriminately, as it
is possible to destroy information the user's program has stored next to
the receive buffer.  What has to happen in this case is that any part of
an incoming buffer that does not completely fill a virtual memory page
must be copied into the user's buffer.  We did look at a number of tests
in which we used a fairly large buffer size (60144 and 49152) that
wasn't a multiple of 4 KB.  The results were fairly consistent: if you
forced a copy of data because your buffer was not exactly a multiple of
4 KB, the performance fell to about 27-28 MB/second, which still isn't
bad when compared to the competition.  As for the TCP_NODELAY feature of
ttcp, we were a little surprised that the -D switch didn't make more of
a difference than observed.

Copying Files over IRIS HIPPI:
==============================

While blasting data in HIPPI-PH mode and doing memory-to-memory
transfers with TCP/IP are interesting, there is a great deal of interest
in disk-to-disk transfers with rcp and ftp.  While both of these
routines use TCP/IP, they also use the disks, which adds another level
of complexity, i.e., disk performance.

So far, we have had five or six users of IRIS HIPPI call and complain
about ftp performance in particular.  They usually couldn't get more
than four or five megabytes per second.  Our first question was always
`How did you stripe your disks?'  Inevitably, the answer was `Huh?'
That is to say, what they really were doing was copying a file from an
unstriped disk over IRIS HIPPI to another disk which was unstriped.
Since the typical SCSI-II disk drive can read or write at between four
and five megabytes per second, what they were actually doing was simply
measuring how fast the disk drives could do I/O.  Therefore, it is
mandatory to use striped disks to get really good disk-copy performance
over IRIS HIPPI.

Originally, we planned to get the largest disk farm we could to do these
tests.
Unfortunately, a number of outside pressures limited us to a total of
twelve disks.  These were striped three disks each on two SCSI
controllers, so we ended up with six-way striping.  Under ideal
conditions, we can often get about 14 MB/sec off of a SCSI controller
which has three disks on it, so it is likely that we could get 20 to 25
megabytes per second off of the six-way striped configuration.  As it
turned out, there were other problems that we ran into.

Copying Files with rcp.
=======================

While most people who buy HIPPI interfaces tend to use ftp for copying
files, we decided to look at rcp as well.  We'll report those results
first, and then go into ftp a good deal deeper.

Our rcp copying tests were fairly simple.  We used mkfile to create a
number of files, varying in size from 1 megabyte to 1 gigabyte.  Then we
ran rcp to copy them over IRIS HIPPI, using timex to see how long each
copy took.  Next, we did a divide and, voila, we had how many megabytes
per second rcp copied the files.  This was done both on the 6-way
striped disks and on single unstriped disks, to underscore that you
really need to stripe your disks if you want performance.

The results are in Table 3; however, before we get to the actual
numbers, we had better note that we did transform the numbers slightly
so that they are comparable to ftp's reported performance.  As many of
you know, ftp reports its performance in KB (1024 bytes), but the number
printed is in base ten.  In other words, a reported 1000 KB, or a
`megabyte', is really 1000 * 1024 bytes.  Therefore, we converted the
rcp performance to the same format so that apples-to-apples comparisons
with ftp performance are possible.

Table 3 shows the file copy performance of various sized files over IRIS
HIPPI using rcp on 6-way striped and unstriped disks.  As noted above,
the data was converted to be comparable to ftp performance results.
    File Size                6-Way Striped     Unstriped

      1 MB (1024*1024)       1.11 MB/second    1.04 MB/second
     10 MB (10*1024*1024)    5.59 MB/second    4.36 MB/second
    100 MB (100*1024*1024)   7.32 MB/second    4.59 MB/second
      1 GB (1000*1024*1024)  7.45 MB/second    4.58 MB/second

        Table 3  File copy performance of various sized files over
                 IRIS HIPPI using rcp on 6-way striped and unstriped
                 disks

Quite clearly, the results show that striping the disks is important.
The performance for unstriped disks is limited to the disk's speed.  In
the case of 6-way striping, we got as much as 7.45 MB/second using rcp.
The lower performance shown for the 1 MB and 10 MB files is due both to
TCP's self-regulation features (slow-start, MTU discovery, etc.) and to
the rather lengthy set-up handshaking (user validation, etc.) that
occurs between rcp and rcpd.  It is safe, however, to say that rcp can
get about 7.45 MB/second, at least with a striped disk as tested.

While 7.45 MB/second for the six-way striped disks is a lot better than
the 4.5 MB/second number obtained from the unstriped disk, it is still
nothing like the 20 to 25 MB/second that we could have gotten.  The
question is why?  We explored this when we tested ftp performance on the
same disks and files we used to test rcp.

Copying Files with ftp.
=======================

Obviously, we were a little disappointed with the file copying
performance we saw with rcp.  So the first thing we did was to repeat
the test with ftp, using the same disk configurations and file sizes we
used in the rcp test.  The results are shown in Table 4.  We refer to
this version of ftp as `standard', for it is the version shipped with
IRIX 5.2.  As you will see, we ended up modifying ftp.
    File Size                6-Way Striped     Unstriped

      1 MB (1024*1024)       3.97 MB/second    3.35 MB/second
     10 MB (10*1024*1024)    4.62 MB/second    4.01 MB/second
    100 MB (100*1024*1024)   5.91 MB/second    4.38 MB/second
      1 GB (1000*1024*1024)  7.07 MB/second    4.45 MB/second

        Table 4  File copy performance of various sized files over
                 IRIS HIPPI using standard ftp on 6-way striped and
                 unstriped disks

Oh, well, so ftp is not any better than rcp.  If anything, it has
slightly worse performance.  Please note that the apparent improvement
in the 1 MB file case, where ftp transferred data at 3.97 MB/second, is
really because rcp does the set-up handshaking we noted, and ftp
doesn't.

Once again, we were less than totally pleased with the results.  After
all, we have seen disk-to-disk copying done at much higher speeds than 7
MB/second when six-way striped disks were used.  And we also know that
ttcp can run at nearly 48 MB/second over IRIS HIPPI.  However, those
disk-file copying tests were not done with cp, but with special-purpose
programs that took advantage of the features of IRIX.  Perhaps the
problem is in ftp.  As it turned out, it was.

About this time, we decided to have a hard look at ftp.  That is, we dug
the sources out and looked at what it was doing.  And, guess what?  It
was doing its I/O, both to and from disk as well as to TCP/IP, in 16 KB
blocks.  While this is adequate for, say, copying files over the
internet, it is nothing like the software needed to drive megabytes over
the HIPPI interface.  Careful examination of ftp's sources disclosed a
number of other flaws that would limit disk as well as network
performance.  However, the biggest problem was the 16 KB I/O being used.

So, we got out our software meat axe and hacked ftp a bit.  Actually,
all we did was increase the buffer size definition, which, in turn,
increased the nbyte parameter in the reads and writes, because they were
defined in terms of sizeof(buffer).
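The shape of the hack can be sketched roughly as follows; FTP_BUFSIZ,
iobuf, and copy_stream are our illustrative names, not the actual ftp
source:

```c
#include <unistd.h>
#include <fcntl.h>

#define FTP_BUFSIZ (60 * 1024)   /* the stock source used 16 * 1024 */

char iobuf[FTP_BUFSIZ];

/* Copy from one descriptor to another.  Because the read and write
 * lengths follow sizeof the buffer, raising the one #define raises
 * the I/O size on both the disk and the network side at once.
 * Returns the number of bytes copied, or -1 on error. */
long copy_stream(int in, int out)
{
    long total = 0;
    ssize_t n;

    while ((n = read(in, iobuf, sizeof iobuf)) > 0) {
        if (write(out, iobuf, (size_t)n) != n)
            return -1;
        total += n;
    }
    return (n < 0) ? -1 : total;
}
```

Since every read and write takes its size from sizeof the buffer,
bumping the single definition changes the I/O size everywhere at once,
which is essentially all our modification did.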
Thus, it was very easy to change the I/O size being used, not only to
and from disk, but over the IRIS HIPPI interface as well.  We chose two
new buffer sizes: 60 KB and 960 KB.  The 60 KB size was chosen because
it is the sweet spot we found with ttcp in Table 2, and 960 KB is 16
times as large.  In hindsight, we should have used even bigger buffer
sizes.

First, the results for the 60 KB version of ftp.  Both ftp and ftpd were
modified as noted above to do I/O in 60 KB buffers.  Then we repeated
the testing we did on the standard version of ftp.

    File Size                6-Way Striped     Unstriped

      1 MB (1024*1024)       3.54 MB/second    3.41 MB/second
     10 MB (10*1024*1024)    7.21 MB/second    4.23 MB/second
    100 MB (100*1024*1024)   7.79 MB/second    4.43 MB/second
      1 GB (1000*1024*1024)  7.72 MB/second    4.39 MB/second

        Table 5  File copy performance of various sized files over
                 IRIS HIPPI using ftp with 60 KB buffers on 6-way
                 striped and unstriped disks

That's better!  It isn't a whole lot better than rcp, but it does show
that the performance problem is not in IRIS HIPPI but in ftp itself.
So, let's try the 960 KB buffer size.  Both ftp and ftpd were modified
to do I/O in 960 KB buffers.

    File Size                6-Way Striped     Unstriped

      1 MB (1024*1024)       3.32 MB/second    3.29 MB/second
     10 MB (10*1024*1024)    7.69 MB/second    4.32 MB/second
    100 MB (100*1024*1024)   9.51 MB/second    4.53 MB/second
      1 GB (1000*1024*1024)  9.52 MB/second    4.28 MB/second

        Table 6  File copy performance of various sized files over
                 IRIS HIPPI using ftp with 960 KB buffers on 6-way
                 striped and unstriped disks

Still better, but not quite 10 MB/sec.  However, it is now very clear
that there are still other problems in ftp that need to be solved if we
are to get warp-speed performance.  We looked at them and decided to
leave them alone, as it would take a major rewrite of ftp to do what was
really necessary.  This consisted mainly of redoing the disk I/O
buffering, and we frankly didn't have the time.
However, there were a few things we could do to get some idea of just
what performance we could eventually get.  One was to cheat on reading
the disk by using two copies of ftp and ftpd to read and write the same
files at the same time.  This is a bogus measure of disk I/O, because
EFS simply reads the second stream of I/O out of the disk cache on the
transmitting system.  However, it nevertheless sends two streams of data
through TCP/IP and IRIS HIPPI, so we could still measure the transfer
rate over the network.  We got an aggregate of 15.66 MB/second when we
did this.

Summary of ftp Performance.
===========================

Thus, in summary, ftp performance is really limited by the way the code
is written.  Clearly, much higher file transfer rates are possible when
even simple things like increasing the I/O buffer sizes are done.